ABSTRACT

NASA’s Exploration Initiative will require development 
of many new systems or systems of systems. One 
specific example is that safe, affordable, and reliable 
upper stage systems to place cargo and crew in stable 
low earth orbit are urgently required. In this paper, we 
examine the failure history of previous upper stages 
with liquid oxygen (LOX)/liquid hydrogen (LH2) 
propulsion systems. Launch data from 1964 until 
midyear 2005 are analyzed and presented. This data 
analysis covers upper stage systems from the Ariane, 
Centaur, H-IIA, Saturn, and Atlas in addition to other 
vehicles. Upper stage propulsion system elements have 
the highest impact on reliability. This paper discusses 
failure occurrence in all aspects of the operational 
phases (i.e., initial burn, coast, restarts, and trends in
failure rates over time). In an effort to understand the 
likelihood of future failures in flight, we present 
timelines of engine system failures relevant to initial 
flight histories. Some evidence suggests that propulsion 
system failures as a result of design problems occur 
shortly after initial development of the propulsion 
system; whereas failures because of manufacturing or 
assembly processing errors may occur during any phase 
of the system build process. This paper also explores
the detectability of historical failures. Observations 
from this review are used to ascertain the potential for 
increased upper stage reliability given investments in 
integrated system health management. Based on a clear 
understanding of the failure and success history of 
previous efforts by multiple space hardware 
development groups, the paper will investigate potential 
improvements that can be realized through application 
of system safety principles. 

1. INTRODUCTION 

America’s Vision for Space Exploration, as defined by a 
Presidential announcement of January 14, 2004, calls 
for implementation of sustainable, affordable and 
reliable human programs to explore the Solar System 
and beyond. NASA is moving forward with 
development of new transportation systems to 
accomplish this vision. A new upper stage system is 
required to launch the Crew Exploration Vehicle (CEV)
and will be developed over the next few years by NASA
MSFC.

The NASA Upper Stage System is a part of the crew 
launch system and will be designed to lift approximately 
22 mT into Low Earth Orbit (LEO). As a result, it will 
be the largest upper stage system since the Apollo 
program. The upper stage, as shown in Fig. 1, will have
a diameter of 5.5 meters. The NASA Upper Stage will 
be powered by a modified SSME that uses liquid 
hydrogen and liquid oxygen as propellants. As a result, 
the upper stage will include a main propulsion system 
that provides propellant feed to the engine and bleeds 
from the engine. Tank pressurization, inert purge, 
hydraulic, electronic and other subsystems will be 
required as well. Although this development will
leverage the outstanding reliability record of the shuttle
liquid propulsion system, the propulsion community’s
best efforts to develop a highly reliable upper stage
are still required.



Fig. 1. New NASA CEV Upper Stage 

2. HISTORY OF UPPER STAGE SYSTEMS 
2.1 Centaur 

The U.S. has operated cryogenic upper stages since 
1963. Centaur was the first cryogenic upper stage and 
evolved versions of Centaur continue in service after 
more than forty years of operating history. The 
evolution of the Centaur upper stage has progressed 
incrementally to the present version (Common Centaur) 
flown on Atlas IIIA and Atlas V boosters. The
evolution has been driven by national priorities, the
commercialization of the Atlas/Centaur launcher after
the Challenger disaster, and myriad technology
advances over four decades. All Centaur versions share
construction techniques and materials for propellant 
tanks and major subsystems such as propulsion, tank 
pressure control and avionics. The different versions 
are distinguished by tank capacities, shape and booster 
vehicle integration. Centaur versions A, B and C were 
test and development versions that were flown between 
1962 and 1965. 

The single engine Centaur IIIA provided the foundation 
for development of the current version now being flown 
on Atlas IIIB and Atlas V series launch vehicles. The 
Centaur stage was lengthened by about 1.7 meters, 
increasing the propellant load from 16.9 to 20.8 tons. 
Other design changes include reliability enhancements 
associated with the RL-10A-4-2 engine and stage
structures, changes to reduce parts counts and to 
increase commonality between stage configurations. 
The Common Centaur stage is configurable as either 
single engine or dual engine using a single tank design. 

2.2 Saturn-IV 

The Saturn-IV was the first Saturn upper stage system 
developed. The Saturn-IV consisted of six Pratt & 
Whitney RL-10A-3 engines arranged in a hexagonal 
pattern delivering 41 tons (90,000 lb) thrust. A 
truncated cone-shaped thrust structure transferred 
engine force to the propellant tank walls. Just above the 
engines was an elliptical LOX tank which shared a 
common bulkhead with a cylindrical LH2 tank. 
Propellant tanks were of aluminum construction. Six 
separate LH2 feed lines wrapped around the LOX tank 
and fed each engine. A much more powerful upper 
stage was required when manned lunar landing became 
the U.S. national priority, and NASA resources were 
aligned to meet that goal. The six Saturn-IV upper stage 
flights between 1964 and 1965 were all successful. 

2.3 Saturn-IVB 

The Saturn-IVB upper stage used a single Rocketdyne J-2
LOX/LH2 engine that provided 104 tons (230,000 lb)
thrust. Propellant storage was cylindrical, constructed 
of aluminum, and consisted of an almost spherical LOX 
tank which shared a common bulkhead with the LH2 
tank. Propellant mass was about 104 tons (230,000 lbs). 
The Saturn-IVB stage had restart capability. Twenty-two
Saturn-IVB upper stages were launched between
1966 and 1975, and the stage proved to be very reliable.



Fig. 2. Saturn-IV (L) and Saturn-IVB (R) Upper Stages

2.4 Ariane 

The first successful Ariane 4 flights with the H10+ and H10-3
upper stages occurred in 1992 and 1995, respectively.
A new cryogenic upper stage was developed for the 
Ariane 5 launcher. Although derived from the H10 
upper stages, the ESC-A upper stage required a redesign 
of propellant tanks to increase the available propellant 
mass needed to satisfy increasing payload sizes. In 
ESC-A, the LOX and LH2 tanks are separated and do 
not share common bulkheads. The LOX tank is a 
cylindrical aluminum vessel similar to H10. The LH2 
tank is completely redesigned, adapting to the outer 
diameter of the Ariane 5. It is also of aluminum 
construction and uses the same bulkhead technology 
developed for the Ariane 5 booster LH2 tanks. The 
redesigned tanks allowed propellant mass to be 
increased to 14.4 tons. ESC-A is equipped with the 
same HM-7B gas generator engine used on previous 
Ariane upper stages. Engine burn time is extended to
about 950 seconds with no restart capability. 

2.5 Long March 

After the U.S. and Europe, China became
the third nation to successfully fly a cryogenic upper 
stage. The CZ-3 upper stage used a small, four 
combustion chamber YF-73 gas generator power cycle 
engine that provided 4.5 tons (9,900 lb) of thrust. The 
system had restart capability. This engine was very 
similar in appearance and performance to the HM-4 
engine that was originally intended for the Ariane 1 
launcher, but was never flown. The CZ-3 upper stage 
tanks carried 8.5 tons (18,700 lb) of propellant. This 
upper stage first flew in 1984; its 13th and final flight
was in 2000. 

The CZ-3A (B, C) upper stage was a significant upgrade 
from the original CZ-3 upper stage. Propellant capacity 
was increased by over 100% to 18.2 tons (40,100 lb), 
and the small YF-73 engine was replaced by a much 
more powerful dual thrust chamber YF-75 engine which 
was capable of providing 16 tons (35,300 lb) thrust with 
restart. This upper stage was also designed as the third 
stage for CZ-3B and CZ-3C launchers. The CZ-3A 
upper stage was initially launched in 1994. 

2.6 H-II

The Japanese H-II upper stage was a significant scale up 
from the H-I. The common bulkhead propellant tanks 
were lengthened and widened, increasing propellant 
weight to 14 tons (30,900 lbs). A redesigned LE-5A 
hydrogen open expander cycle engine was used, 
boosting thrust to 12.4 tons (27,400 lbs). The H-II 
upper stage logged seven flights with one failure. It was 
first launched in 1994; the last flight occurred in 1999. 

The H-IIA second stage was modified in several ways 
from its H-II precursor. It used a simplified structure 
consisting of separate propellant tanks held together by 
carbon composite support trusses rather than a common 
bulkhead design. The tanks were further enlarged, with 
propellant capacity of 16.6 tons (36,600 lb). The LH2
tank is essentially the same structure supplied for Delta
III/IV upper stages. The propulsion system is
simplified, utilizing more reliable valves and an 
improved LE-5B engine which provides 14 tons (30,900 
lb) thrust. The H-IIA upper stage inaugural flight 
occurred in 2001. 

2.7 Delta Cryogenic Upper Stage (DCUS) 

DCUS was developed in 1998. The propellant tanks 
carried a 16.8 ton (37,000 lb) propellant load. 
Propellant tanks were separate, self-supporting 
structures for simplified production and reduced 
technical risk. The LH2 tank was 4 meters (13.1 ft) in
diameter. Propulsion was provided by a single Pratt & 
Whitney RL-10B-2 engine which produced 11.2 tons 
(24,700 lb) of thrust and was capable of restart in space. 

The DCUS unit was modified for use on the Delta IV 
class launchers. Two Delta IV cryogenic upper stages 
are produced. Both use a 3 meter LOX tank suspended 
beneath the LH2 tank by an intertank truss. The 
propulsion system is unchanged from the Delta III upper 
stage. The Delta IV, 4-meter upper stage is identical to 
the Delta III version but with lengthened tanks for 
greater propellant load (~20% propellant load increase
from 16.8 to 20.4 tons). A 5-meter diameter version 
with expanded tanks has been developed as upper stage 
for the heavier lift Delta IV launchers. Tanks were 
expanded another 33%, increasing the propellant 
capacity to 27.2 tons. 

2.8 GSLV 

The Geosynchronous Satellite Launch Vehicle (GSLV)
marks the first use of hydrogen/oxygen propulsion in
India. For the first two flights, ISRO purchased upper
stages from Khrunichev in Russia, which ironically is
now the only significant space power not to use
hydrogen on its own rockets. The stage is designated
the 12KRB stage in Russia, and is powered by a KVD-1
engine.

3. CRYOGENIC UPPER STAGE FAILURE 
HISTORY 

The Upper Stage for the NASA Crew Launch System 
(CLS) will be the largest upper stage developed since 
the Saturn program. All other upper stages have 
demonstrated payload to LEO of less than 10 metric 
tonnes whereas the new CEV upper stage will have a 
capability of approximately two and one-half times the 
other upper stages. As illustrated by Fig. 3, the thrust 
level of the propulsion system will be similar to that of 
the Saturn S-IVB stage. As a result, the upper stage will
not use evolved subsystems of active flight vehicles.
Instead, new systems based on the legacy Space Shuttle
main propulsion system will be utilized. Nevertheless,
the root causes and solutions of historical failures such 
as manufacturing and inspection flaws have some 
importance for the new upper stage system 
development. 



Fig. 3. Comparison of CLS to Other Launch Vehicle 
LOX/LH2 Upper Stages 


Fig. 4 generally supports the conclusion that success 
rates of an upper stage system increase with experience, 
with more failures occurring early in the life of a system 
and increasing reliability as problems are resolved and 
technology improves. Early success rates for Centaur 
are much lower than other systems, but the Centaur 
program provided a learning curve for other systems 
that came into operation decades later. 



Fig. 4. LOX/LH2 Upper Stage Development Timeline 

Overall upper stage reliability is improving, with
an average failure rate of 3% over the last decade. The
Centaur upper stage is the reliability leader over the last
decade with a failure rate of 1.2%. As might be
expected, the upper stage systems with significantly
more launches (Centaur and Ariane) have the lowest
failure rates of all systems over the recent decade.
These rates are based on 412 launch events, many of
which involved multiple engines and multiple upper
stage burns. Therefore, there are significantly more
upper stage system starts, each of which requires
propellant management, engine spool-up, ignition and
acceleration to full power. When considering sources
of failure, the data show that 85% of failures are caused
by propulsion system elements, that is, engines or
propellant supply systems (Fig. 5). Of the
propulsion-related failures, 40% are engine related.
Although engines are commonly believed to be the
largest failure source, the data indicate that, in cryogenic
upper stage systems, the engines have accounted for
about 33% of all failures (roughly 40% of the 85%
propulsion share). As shown in Fig. 6, the cryogenic
upper stage failure sources are widely distributed. A
significant observation is that, to maintain
and improve cryogenic upper stage reliability, attention
must be focused on all of the stage systems and
subsystems.
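
The relationship between the propulsion share and the engine share quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the two input fractions come from the text, while the function name and the category labels are assumptions introduced here.

```python
# Fractions quoted in the text above.
PROPULSION_SHARE = 0.85          # share of all failures caused by propulsion elements
ENGINE_GIVEN_PROPULSION = 0.40   # share of propulsion failures that are engine related


def failure_breakdown(propulsion_share, engine_given_propulsion):
    """Split all failures into engine, other propulsion, and non-propulsion shares.

    The function name and the three category labels are illustrative assumptions.
    """
    engine = propulsion_share * engine_given_propulsion
    other_propulsion = propulsion_share - engine
    non_propulsion = 1.0 - propulsion_share
    return {
        "engine": engine,
        "other_propulsion": other_propulsion,
        "non_propulsion": non_propulsion,
    }


breakdown = failure_breakdown(PROPULSION_SHARE, ENGINE_GIVEN_PROPULSION)
for name, share in breakdown.items():
    print(f"{name}: {share:.0%}")
```

The engine share comes out at 0.85 x 0.40 = 0.34 of all failures, which is consistent with the "about 33%" figure, and it leaves roughly half of all failures in propulsion elements other than the engine.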

Fig. 5. Distribution of Cryogenic Upper Stage Failures

[Fig. 6 chart labels: Combustion & Energy Control Devices, 17%; Propulsion Control & Gimbal Devices, 24%; Propellant Mgmt Devices, 13%; Guidance and Navigational Controls, 8%; Structural; Propellant Tank, 4%; Pogo, 13%]

Fig. 6. Upper Stage Failure Sources are Widely Distributed

Early detection of potential failures allows in-flight
remediation action such as engine throttle adjustments,
engine shutdown or a decision to activate crew escape
plans. Studies have shown that 70% of all liquid
propulsion failures from 1964-1995 originated outside
the engine in systems such as propellant management,
hydraulic, avionics, software or other subsystems. In
this same time period, 41% of the failures were immediate
and undetectable, whereas 37% were detectable with
adequate time to initiate a response. The remainder of
the failures may not have been detectable or may not
have allowed adequate response time, if detected.

4. RELIABILITY IMPROVEMENT
INITIATIVES

Several factors combine to create a failure-averse
society with high expectations for safety of human
spaceflight. As examples, commercial aviation operates
with an outstanding safety record, automobiles are
increasingly safe and many consumer items such as
electronic equipment are extremely reliable. Although
the Challenger and Columbia Space Shuttle accidents
have sensitized all of us to the risks of spaceflight, in
general, the American public and many decision-makers
do not understand the degree of difficulty of spaceflight
or the extent to which the industry is in its infancy.

As a result, effective risk management for the technical,
cost and schedule aspects of the NASA Upper Stage
system is required. This section outlines initiatives that
will enable development of a highly reliable Upper
Stage system including a robust, interactive system
safety process and reliability analysis process working
in conjunction with the design process, integrated
system health monitoring and management systems,
independent assessments, robust system engineering and
integration design practices through concurrent
engineering environments and validated models.
Project leadership must relentlessly champion all of
these initiatives with well-organized processes and due
attention to the organization’s safety culture.

4.1 Concurrent Engineering Environments

Information Technology advances have enabled
innovators in geographically dispersed areas to
collaborate as never before. Advances such as
broadband connectivity, digital and mobile
telecommunications as well as faster and more powerful
PCs have enabled multidisciplinary analysis, design
optimization at faster rates and more careful control of
configuration. As a result, the resources and time
required to design systems in a concurrent engineering
environment (CEE) have been significantly reduced
with improvements to the quality and reliability of the
final product. The automotive industry was one of the
leaders in outsourcing and serves as a model for
aerospace innovation. Within the CEE, several design,
development, testing and evaluation improvements or
enhancements are needed. These include:

• Design for Six Sigma (DFSS)

DFSS is a widely recognized statistically based 
process that augments typical systems engineering 
functions. DFSS has been adopted by many key 
technology and aerospace industry companies 
including GE. Notable DFSS product development 
successes at GE include GE Power Systems 
commercialization of highly reliable gas turbines 
with breakthrough fuel efficiency performance, and 
GE Medical Systems introduction of a new 
generation of soft tissue scanning systems. The 
DFSS methodology has been successfully used in 
the development of advanced industrial gas turbine 
sealing systems. Specialized probabilistic tools 
such as Technology Identification, Evaluation, and 
Selection (TIES) have been used within the 
framework of DFSS to provide stochastic 
evaluation and down-selection of multiple 
technology combinations for advanced high-bypass 
jet engines and for space transportation systems. 

• Probabilistic Design and Analysis 
Probabilistic design and analytical techniques can 
add value to the design process, as well. As
illustrated by Fig. 7, the upper stage preliminary
design will be evaluated using fault tree and failure
mode and effects analysis to discern component
criticality. Non-critical components will be
designed with traditional deterministic techniques
whereas critical components will be assessed with a
probabilistic design approach. A probabilistic design
approach is most applicable for components that
exhibit variation in material properties, critical
dimensional tolerances, modeling deficiencies, and
environmental uncertainties.

• Internal and Independent Technical Reviews (ITR) 
Thorough internal design reviews are needed to 
identify design deficiencies and risks to program 
success. Multiple spacecraft failure board results 
have cited a lack of penetrating reviews as a source 
of error that ultimately contributed to loss of 
spacecraft. Focused ITRs should evaluate both
flight and ground systems maturity including 
application and management of redundancy, 
performance margins, and use of heritage hardware 
and validation of its applicability. Ultimately, the 
ITR should assure that the design is maturing 
toward a stable design that meets the design 
requirements with margin. The ITR should operate 
with minimal intrusion into mainstream design 
project work. For both internal and independent 
design and technical reviews, the project must have 
allocated schedule contingency to allow for 
incorporation of review findings and recovery. 
Without this purposeful recovery phase, project 
resistance to identified improvements will naturally 
occur. 
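
To illustrate the difference between the deterministic and probabilistic design approaches described in the Probabilistic Design and Analysis item above, the following is a minimal sketch of a stress-strength interference calculation. It is not the method used by the project: the normal distributions, the numerical values, and all function names are assumptions chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def monte_carlo_failure_probability(strength_mean, strength_sd,
                                    load_mean, load_sd,
                                    samples=100_000, seed=0):
    """Estimate P(load >= strength) by sampling two independent normal variables."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(samples):
        strength = rng.gauss(strength_mean, strength_sd)
        load = rng.gauss(load_mean, load_sd)
        if load >= strength:
            failures += 1
    return failures / samples


def analytic_failure_probability(strength_mean, strength_sd, load_mean, load_sd):
    """Closed-form result for independent normal load and strength."""
    margin_mean = strength_mean - load_mean
    margin_sd = math.sqrt(strength_sd ** 2 + load_sd ** 2)
    return NormalDist().cdf(-margin_mean / margin_sd)


# Hypothetical component: strength 500 +/- 25, applied load 400 +/- 30 (same units).
safety_factor = 500.0 / 400.0
p_mc = monte_carlo_failure_probability(500.0, 25.0, 400.0, 30.0)
p_exact = analytic_failure_probability(500.0, 25.0, 400.0, 30.0)
print(f"deterministic safety factor: {safety_factor:.2f}")
print(f"Monte Carlo failure probability: {p_mc:.4f}")
print(f"analytic failure probability:    {p_exact:.4f}")
```

The deterministic view reports only a safety factor of 1.25, whereas the probabilistic view shows how the scatter in material properties and loads translates into a failure probability, which is the quantity of interest for critical components.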



Fig. 7. Candidate Upper Stage Certification Approach 


4.2 Reliability Analysis 

The ARES Cost-Effective Reliability Analysis (CERA)
determines the distribution of an upper stage’s actual
reliability throughout the course of the design effort,
allowing adjustments by the project design team. The
process accounts for the uncertainty in a reliability
analysis that is associated with a component’s normal
variability or with lack of knowledge. Best engineering
judgment, surrogate data, test data, or actual component
data are used with standard Bayesian processes to
determine the best representation of a component’s
failure probability or failure rate. The reliability
analysis determines the
major reliability drivers based on either the uncertainties 
associated with the component’s failure rate, or how the 
components operate within the designed system. By 
addressing and managing the major reliability drivers 
one can identify the most cost-effective way to improve 
the system’s reliability. 
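
As an illustration of the kind of Bayesian update mentioned above, the sketch below combines a prior belief about a component's failure probability with test results using a conjugate beta-binomial model. This is a generic textbook technique, not a description of the CERA implementation; the prior parameters, the test counts, and the function names are all hypothetical.

```python
def update_beta(prior_alpha, prior_beta, failures, successes):
    """Conjugate update of a Beta(alpha, beta) prior on failure probability."""
    return prior_alpha + failures, prior_beta + successes


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return (alpha * beta) / (total * total * (total + 1.0))


# Hypothetical prior from surrogate data: roughly 1 failure in 50 demands.
prior = (1.0, 49.0)
# Hypothetical component test campaign: 50 demands, no failures.
posterior = update_beta(*prior, failures=0, successes=50)

print(f"prior mean failure probability:     {beta_mean(*prior):.4f}")
print(f"posterior mean failure probability: {beta_mean(*posterior):.4f}")
print(f"prior variance:     {beta_variance(*prior):.6f}")
print(f"posterior variance: {beta_variance(*posterior):.6f}")
```

In this example the successful tests halve the mean failure probability and shrink its variance. Components whose posterior remains wide after available data are applied would be candidates for the uncertainty-driven reliability drivers discussed above.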

4.3 Damage Tolerance Testing 

Although the cost and time required to demonstrate
high reliability through testing alone are prohibitive,
testing has a significant role in reliability improvement.
As an example, damage tolerance testing from the DoD
Engine Structural Integrity Program (ENSIP)
has been successfully used to validate models used in
the probabilistic design of critical components. Damage
tolerance testing may include testing of pre-flawed
components to evaluate crack growth and design
sensitivity to crack size. Fig. 8 compares commercial
and military engine rotor failures and demonstrates that
the application of damage tolerant design principles and
testing has reduced the number of distress events
occurring in military engine rotors. The large reduction
noted in military engine rotor failures starting in 1979
is credited to ENSIP. Similarly, the large
reduction in this same metric in commercial engines in
1986 is apparently a result of the ENSIP approaches
being applied to commercial engines approximately
eight years later, a typical military to commercial
technology transition time.
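
The sensitivity of component life to initial crack size, which pre-flawed testing is meant to probe, can be sketched with a simple crack growth model. The example below assumes the Paris relation da/dN = C (ΔK)^m with ΔK = Y Δσ sqrt(π a); this model, the material constants, the geometry factor, and the flaw sizes are assumptions for illustration and are not taken from ENSIP or from this paper.

```python
import math


def cycles_to_critical(initial_crack_m, critical_crack_m, stress_range_mpa,
                       paris_c=1e-11, paris_m=3.0, geometry_factor=1.12,
                       steps=10_000):
    """Integrate the Paris relation from the initial to the critical crack size.

    Units assumed here: crack length in meters, stress in MPa, and paris_c in
    m/cycle per (MPa*sqrt(m))**paris_m. All default constants are hypothetical.
    """
    if initial_crack_m >= critical_crack_m:
        return 0.0
    step = (critical_crack_m - initial_crack_m) / steps
    cycles = 0.0
    for i in range(steps):
        crack = initial_crack_m + (i + 0.5) * step  # midpoint rule
        delta_k = geometry_factor * stress_range_mpa * math.sqrt(math.pi * crack)
        cycles += step / (paris_c * delta_k ** paris_m)
    return cycles


# Compare two hypothetical initial flaw sizes under the same loading.
life_small_flaw = cycles_to_critical(0.0005, 0.010, stress_range_mpa=200.0)
life_large_flaw = cycles_to_critical(0.0010, 0.010, stress_range_mpa=200.0)
print(f"cycles from 0.5 mm flaw: {life_small_flaw:,.0f}")
print(f"cycles from 1.0 mm flaw: {life_large_flaw:,.0f}")
```

Under these assumptions, doubling the initial flaw size removes a large fraction of the predicted life, because most cycles are spent while the crack is small. Test data from pre-flawed components would be used to check whether a model of this kind is adequate before it is relied on in a probabilistic design.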



Fig. 8. Air Force Engine Development 

The DoD’s ENSIP process is quite complex and is the 
result of extensive development. Although this 
approach should yield major reliability improvements 
for the upper stage system development, adequate 
preparation for the NASA-contractor team is required 
before ENSIP type requirements are imposed on a new 
development program. The failure modes for space 
systems are not the same as aircraft systems and the 
environments are very different. A pilot advanced 
development project to explore and develop similar 
concepts for space systems would provide the readiness 
to use a Space ENSIP for future Exploration systems. 

4.4 Integrated System Health Monitoring and 
Management 

Integrated System Health Monitoring and Management 
(ISHM) is an emerging competency for all highly 
reliable flight systems. Experience from all U.S.
manned space program accidents demonstrates that
complete, accurate assessment of vehicle conditions is
required to assure crew safety and mission success.
While potential benefits of this system are easily 
recognized, further development efforts are required. 
NASA is currently funding technology development 
efforts led by Penn State University and NASA SSC. 
Incremental development of the ISHM provides an 
opportunity to mature the system before application to 
active flight systems. Key areas for future development 
include: selection of ISHM architecture(s), smart 
sensors, health detection algorithms, and 
communication protocols. Methods for validation of 
these systems prior to activation in a flight system are 
also needed. 

Propulsion ground test facilities offer an excellent and 
realistic venue for large-scale health monitoring 
technology demonstrations and afford the opportunity
for inducing failures that must be detected and acted on
to maintain safe conditions. Health management should 
incorporate fault diagnostics that are capable of 
detecting, isolating and identifying faults in subsystems 
or components. Information produced by the fault 
diagnostics process in the control of the ground test 
systems may then be utilized to develop integrated fault- 
adaptive control approaches for flight vehicles. The 
Space Shuttle and other flight vehicles offer potential as 
technology demonstrators as well. 
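
A minimal sketch of the detect-and-isolate step described above is given below. It uses fixed limits with a persistence count, which is only one simple possibility; the sensor names, limit values, and persistence setting are hypothetical and do not describe any NASA ISHM design.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    low: float
    high: float


def detect_faults(samples, limits, persistence=3):
    """Return {sensor: sample index} for each sensor with a confirmed fault.

    A fault is confirmed when a sensor is outside its limits for `persistence`
    consecutive samples, which filters out single-sample spikes. Missing
    readings are skipped and leave the running count unchanged.
    """
    counts = {name: 0 for name in limits}
    confirmed = {}
    for index, reading in enumerate(samples):
        for name, limit in limits.items():
            value = reading.get(name)
            if value is None:
                continue
            if value < limit.low or value > limit.high:
                counts[name] += 1
                if counts[name] >= persistence and name not in confirmed:
                    confirmed[name] = index
            else:
                counts[name] = 0
    return confirmed


# Hypothetical sensors and limits for a ground test article.
limits = {
    "lox_inlet_pressure_kpa": Limit(200.0, 400.0),
    "turbine_temp_k": Limit(0.0, 900.0),
}
samples = [
    {"lox_inlet_pressure_kpa": 300.0, "turbine_temp_k": 850.0},
    {"lox_inlet_pressure_kpa": 450.0, "turbine_temp_k": 880.0},  # one-sample spike
    {"lox_inlet_pressure_kpa": 310.0, "turbine_temp_k": 910.0},
    {"lox_inlet_pressure_kpa": 305.0, "turbine_temp_k": 930.0},
    {"lox_inlet_pressure_kpa": 300.0, "turbine_temp_k": 960.0},
]
faults = detect_faults(samples, limits)
print(faults)
```

Here the isolated pressure spike is ignored, while the sustained temperature exceedance is confirmed and attributed to a specific sensor. Induced failures on a ground test stand, as suggested above, would provide the data needed to tune limits and persistence values and to measure missed and false detections.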

4.5 Organizational Attributes for Design of Highly 
Reliable Launch Systems 

One of the primary drivers for highly reliable designs is 
the structure and safety culture of the design 
organization. Studies have shown that well-defined 
organizational processes allow effective communication 
between contributing design team members. As an 
example, the decision-making process is well 
documented and well understood by all design team 
members. This process includes ample time for 
discussion and debate of competing concepts and a 
positive view of system safety. Roles and 
responsibilities of all team members are defined and 
understood by others. 

Design organizations that have a strong, interactive 
system safety culture tend to produce highly reliable 
designs. Based on studies after the Chernobyl nuclear 
accident and Challenger tragedy in 1986, safety culture 
is the enduring value and priority placed on safety by 
every design team member. It refers to the extent to 
which individuals and groups commit to personal 
responsibility for risk reduction and will act to preserve, 
enhance and communicate system safety. Safety culture
is reflected in an organization’s willingness to develop
and learn from errors or near misses. Further, this
safety culture recognizes and encourages contributions 
from all team members, but is championed by project 
leadership. It is relatively enduring, stable and resistant 
to change. 

Indicators that help measure the strength of these 
attributes include organizational commitment, 
leadership and management involvement, employee 
empowerment, a strong rewards system and reporting 
systems. 

5. SUMMARY AND CONCLUSIONS 

This paper traces the historical development of 
cryogenic upper stage systems and compares previous 
efforts with the new NASA Upper Stage. It outlines 
important initiatives to assure the reliability of the new 
NASA Upper Stage including integrated system health 
monitoring and management, damage tolerance testing, 
technical reviews, probabilistic design and analysis, a 
robust reliability analysis process that is closely
coupled to the design process, and a sound design
organizational structure that embraces system safety as 
an integral aspect of the design process. 